AITopics | bias-variance tradeoff

Collaborating Authors

bias-variance tradeoff

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Markovian Interference in Experiments

Neural Information Processing SystemsApr-24-2026, 08:34:39 GMT

We consider experiments in dynamical systems where interventions on some experimental units impact other units through a limiting constraint (such as a limited supply of products). Despite outsize practical importance, the best estimators for this'Markovian' interference problem are largely heuristic in nature, and their bias is not well understood.

artificial intelligence, estimator, machine learning, (17 more...)

Neural Information Processing Systems

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)

Genre: Research Report > Experimental Study (0.93)

Industry:

Health & Medicine (1.00)
Transportation > Ground > Road (0.94)
Transportation > Passenger (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.69)
Information Technology > Data Science (0.68)

Add feedback

Variational Garrote for Sparse Inverse Problems

Lee, Kanghun, Soh, Hyungjoon, Jo, Junghyo

arXiv.org Machine LearningMar-16-2026

Sparse regularization plays a central role in solving inverse problems arising from incomplete or corrupted measurements. Different regularizers correspond to different prior assumptions about the structure of the unknown signal, and reconstruction performance depends on how well these priors match the intrinsic sparsity of the data. This work investigates the effect of sparsity priors in inverse problems by comparing conventional L1 regularization with the Variational Garrote (VG), a probabilistic method that approximates L0 sparsity through variational binary gating variables. A unified experimental framework is constructed across multiple reconstruction tasks including signal resampling, signal denoising, and sparse-view computed tomography. To enable consistent comparison across models with different parameterizations, regularization strength is swept across wide ranges and reconstruction behavior is analyzed through train-generalization error curves. Experiments reveal characteristic bias-variance tradeoff patterns across tasks and demonstrate that VG frequently achieves lower minimum generalization error and improved stability in strongly underdetermined regimes where accurate support recovery is critical. These results suggest that sparsity priors closer to spike-and-slab structure can provide advantages when the underlying coefficient distribution is strongly sparse. The study highlights the importance of prior-data alignment in sparse inverse problems and provides empirical insights into the behavior of variational L0-type methods across different information bottlenecks.

artificial intelligence, data quality, machine learning, (18 more...)

arXiv.org Machine Learning

2603.12562

Country:

Asia > South Korea > Seoul > Seoul (0.05)
North America > United States (0.04)

Genre: Research Report (0.69)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.47)

Technology:

Information Technology > Data Science > Data Quality > Data Transformation (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Add feedback

Bias-variance Tradeoff in Tensor Estimation

Kumar, Shivam, Xu, Haotian, Padilla, Carlos Misael Madrid, Khoo, Yuehaw, Padilla, Oscar Hernan Madrid, Wang, Daren

arXiv.org Machine LearningSep-23-2025

We study denoising of a third-order tensor when the ground-truth tensor is not necessarily Tucker low-rank. Specifically, we observe $$ Y=X^\ast+Z\in \mathbb{R}^{p_{1} \times p_{2} \times p_{3}}, $$ where $X^\ast$ is the ground-truth tensor, and $Z$ is the noise tensor. We propose a simple variant of the higher-order tensor SVD estimator $\widetilde{X}$. We show that uniformly over all user-specified Tucker ranks $(r_{1},r_{2},r_{3})$, $$ \| \widetilde{X} - X^* \|_{ \mathrm{F}}^2 = O \Big( κ^2 \Big\{ r_{1}r_{2}r_{3}+\sum_{k=1}^{3} p_{k} r_{k} \Big\} \; + \; ξ_{(r_{1},r_{2},r_{3})}^2\Big) \quad \text{ with high probability.} $$ Here, the bias term $ξ_{(r_1,r_2,r_3)}$ corresponds to the best achievable approximation error of $X^\ast$ over the class of tensors with Tucker ranks $(r_1,r_2,r_3)$; $κ^2$ quantifies the noise level; and the variance term $κ^2 \{r_{1}r_{2}r_{3}+\sum_{k=1}^{3} p_{k} r_{k}\}$ scales with the effective number of free parameters in the estimator $\widetilde{X}$. Our analysis achieves a clean rank-adaptive bias--variance tradeoff: as we increase the ranks of estimator $\widetilde{X}$, the bias $ξ(r_{1},r_{2},r_{3})$ decreases and the variance increases. As a byproduct we also obtain a convenient bias-variance decomposition for the vanilla low-rank SVD matrix estimators.

inequality follow, matrix, nullx 1, (15 more...)

arXiv.org Machine Learning

2509.17382

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > Illinois > Cook County > Chicago (0.04)
Europe > Spain > Galicia > Madrid (0.04)
(4 more...)

Genre: Research Report (0.50)

Industry: Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.67)

Add feedback

When three experiments are better than two: Avoiding intractable correlated aleatoric uncertainty by leveraging a novel bias--variance tradeoff

Scherer, Paul, Kirsch, Andreas, Taylor-King, Jake P.

arXiv.org Artificial IntelligenceSep-5-2025

Real-world experimental scenarios are characterized by the presence of heteroskedastic aleatoric uncertainty, and this uncertainty can be correlated in batched settings. The bias--variance tradeoff can be used to write the expected mean squared error between a model distribution and a ground-truth random variable as the sum of an epistemic uncertainty term, the bias squared, and an aleatoric uncertainty term. We leverage this relationship to propose novel active learning strategies that directly reduce the bias between experimental rounds, considering model systems both with and without noise. Finally, we investigate methods to leverage historical data in a quadratic manner through the use of a novel cobias--covariance relationship, which naturally proposes a mechanism for batching through an eigendecomposition strategy. When our difference-based method leveraging the cobias--covariance relationship is utilized in a batched setting (with a quadratic estimator), we outperform a number of canonical methods including BALD and Least Confidence.

artificial intelligence, experiment, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2509.04363

Genre: Research Report (0.82)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Differences-in-Neighbors for Network Interference in Experiments

Peng, Tianyi, Ye, Naimeng, Zheng, Andrew

arXiv.org Machine LearningMar-3-2025

Experiments in online platforms frequently suffer from network interference, in which a treatment applied to a given unit affects outcomes for other units connected via the platform. This SUTVA violation biases naive approaches to experiment design and estimation. A common solution is to reduce interference by clustering connected units, and randomizing treatments at the cluster level, typically followed by estimation using one of two extremes: either a simple difference-in-means (DM) estimator, which ignores remaining interference; or an unbiased Horvitz-Thompson (HT) estimator, which eliminates interference at great cost in variance. Even combined with clustered designs, this presents a limited set of achievable bias variance tradeoffs. We propose a new estimator, dubbed Differences-in-Neighbors (DN), designed explicitly to mitigate network interference. Compared to DM estimators, DN achieves bias second order in the magnitude of the interference effect, while its variance is exponentially smaller than that of HT estimators. When combined with clustered designs, DN offers improved bias-variance tradeoffs not achievable by existing approaches. Empirical evaluations on a large-scale social network and a city-level ride-sharing simulator demonstrate the superior performance of DN in experiments at practical scale.

estimator, experiment, interference, (15 more...)

arXiv.org Machine Learning

2503.02271

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > Canada > British Columbia (0.04)

Genre:

Research Report > New Finding (0.68)
Research Report > Experimental Study (0.46)

Industry:

Transportation > Ground > Road (1.00)
Information Technology > Services (0.88)
Transportation > Passenger (0.87)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Bandit Smooth Convex Optimization: Improving the Bias-Variance Tradeoff

Neural Information Processing SystemsOct-11-2024, 13:06:45 GMT

Bandit convex optimization is one of the fundamental problems in the field of online learning. The best algorithm for the general bandit convex optimization problem guarantees a regret of \widetilde{O}(T {5/6}), while the best known lower bound is \Omega(T {1/2}) . Many attemptshave been made to bridge the huge gap between these bounds. A particularly interesting special case of this problem assumes that the loss functions are smooth. In this case, the best known algorithm guarantees a regret of \widetilde{O}(T {2/3}) .

bandit smooth convex optimization, bias-variance tradeoff, widetilde, (2 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.53)

Add feedback

Classical Statistical (In-Sample) Intuitions Don't Generalize Well: A Note on Bias-Variance Tradeoffs, Overfitting and Moving from Fixed to Random Designs

Curth, Alicia

arXiv.org Machine LearningSep-27-2024

The sudden appearance of modern machine learning (ML) phenomena like double descent and benign overfitting may leave many classically trained statisticians feeling uneasy -- these phenomena appear to go against the very core of statistical intuitions conveyed in any introductory class on learning from data. The historical lack of earlier observation of such phenomena is usually attributed to today's reliance on more complex ML methods, overparameterization, interpolation and/or higher data dimensionality. In this note, we show that there is another reason why we observe behaviors today that appear at odds with intuitions taught in classical statistics textbooks, which is much simpler to understand yet rarely discussed explicitly. In particular, many intuitions originate in fixed design settings, in which in-sample prediction error (under resampling of noisy outcomes) is of interest, while modern ML evaluates its predictions in terms of generalization error, i.e. out-of-sample prediction error in random designs. Here, we highlight that this simple move from fixed to random designs has (perhaps surprisingly) far-reaching consequences on textbook intuitions relating to the bias-variance tradeoff, and comment on the resulting (im)possibility of observing double descent and benign overfitting in fixed versus random designs.

estimator, prediction, prediction error, (14 more...)

arXiv.org Machine Learning

2409.18842

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre:

Instructional Material > Course Syllabus & Notes (0.44)
Research Report (0.40)

Industry: Education > Curriculum (0.53)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

Add feedback

03a9a9c1e15850439653bb971a4ad4b3-Paper-Conference.pdf

Neural Information Processing SystemsSep-25-2024, 10:06:50 GMT

dq estimator, estimator, experiment, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > Experimental Study (0.93)

Industry:

Health & Medicine (1.00)
Transportation > Ground > Road (0.93)
Transportation > Passenger (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
Information Technology > Data Science (0.68)

Add feedback

Project and Probe: Sample-Efficient Domain Adaptation by Interpolating Orthogonal Features

Chen, Annie S., Lee, Yoonho, Setlur, Amrith, Levine, Sergey, Finn, Chelsea

arXiv.org Artificial IntelligenceMay-25-2023

Transfer learning with a small amount of target data is an effective and common approach to adapting a pre-trained model to distribution shifts. In some situations, target data labels may be expensive to obtain, so we may only have access to a limited number of target data points. To make the most of a very small target dataset, we propose a lightweight, sample-efficient approach that learns a diverse set of features and adapts to a target distribution by interpolating these features. Our approach, Project and Probe (Pro$^2$), first learns a linear projection that maps a pre-trained embedding onto orthogonal directions while being predictive of labels in the source dataset. The goal of this step is to learn a variety of predictive features, so that at least some of them remain useful after distribution shift. Pro$^2$ then learns a linear classifier on top of these projected features using a small target dataset. Theoretically, we find that Pro$^2$ results in more sample-efficient generalization by inducing a favorable bias-variance tradeoff. Our experiments on four datasets, with multiple distribution shift settings for each, show that Pro$^2$ improves performance by 5-15% when given limited target data compared to prior methods such as standard linear probing.

artificial intelligence, distribution shift, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2302.05441

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Latvia > Lubāna Municipality > Lubāna (0.04)
Africa (0.04)

Genre: Research Report (0.81)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Hardware-efficient learning of quantum many-body states

Van Kirk, Katherine, Cotler, Jordan, Huang, Hsin-Yuan, Lukin, Mikhail D.

arXiv.org Artificial IntelligenceDec-12-2022

Efficient characterization of highly entangled multi-particle systems is an outstanding challenge in quantum science. Recent developments have shown that a modest number of randomized measurements suffices to learn many properties of a quantum many-body system. However, implementing such measurements requires complete control over individual particles, which is unavailable in many experimental platforms. In this work, we present rigorous and efficient algorithms for learning quantum many-body states in systems with any degree of control over individual particles, including when every particle is subject to the same global field and no additional ancilla particles are available. We numerically demonstrate the effectiveness of our algorithms for estimating energy densities in a U(1) lattice gauge theory and classifying topological order using very limited measurement capabilities.

artificial intelligence, machine learning, visible space, (17 more...)

arXiv.org Artificial Intelligence

2212.06084

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > Los Angeles County > Pasadena (0.04)
Europe > Switzerland > Basel-City > Basel (0.04)
Asia > China > Hubei Province > Wuhan (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Hardware (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.67)

Add feedback